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ABSTRACT 

We demonstrate that the multicanonical approach is not restricted to Monte Carlo 
simulations, but can also be applied to simulation techniques such as molecular dynam- 
ics, Langevin, and hybrid Monte Carlo algorithms. The effectiveness of the methods 
are tested with an energy function for the protein folding problem. Simulations in the 
multicanonical ensemble by the three methods are performed for a penta peptide, Met- 
enkephalin. For each algorithm, it is shown that from only one simulation run one can 
not only find the global-minimum-energy conformation but also obtain probability dis- 
tributions in canonical ensemble at any temperature, which allows the calculation of any 
thermodynamic quantity as a function of temperature. 
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1. INTRODUCTION 

Simulations in a system with many degrees of freedom by conventional methods such as 
molecular dynamics (MD) and Monte Carlo (MC) can sample only a small portion of 
the entire phase space, rendering the calculations of various thermodynamic quantities 
inaccurate. This is because the energy function has many local minima, and at low tem- 
peratures simulations will necessarily get trapped in the configurations corresponding to 
one of these local minima. In order to overcome this multiple-minima problem, many 
methods have been proposed. For instance, simulated annealing |I[ is one of the most 
widely used algorithms to locate the global-minimum state out of the multitude of local- 
minimum states. The multicanonical approach 0, [3| is another powerful technique. The 
advantage of this algorithm lies in the fact that from only one simulation run one can not 
only find the energy global minimum but also calculate various thermodynamic quantities 
at any temperature. The method was originally developed to overcome the supercritical 
slowing down of first-order phase transitions, [Q, |3| and it was then proposed to be used for 
systems that suffer from the multiple-minima problem such as spin glasses f|] and the pro- 
tein folding problem. || The same method was later referred to as entropic sampling, || but 
the proof of the equivalence of the two methods was given to clarify the matter. [|7| In the 
context of the protein folding problem, the effectiveness of multicanonical algorithms was 
compared with that of simulated annealing. || It was also used to study the coil-globular 



transitions of a model protein, helix-coil transitions of amino- acid homo-oligomers,[10] 
and conformational sampling of a constrained peptide. flTT| 



In all of the previous works the multicanonical ansatz was used in the context of 
Monte Carlo simulations utilizing mostly the Metropolis algorithm |12] to generate a 
Markov chain of configurations. However, other simulation techniques such as molecular 
dynamics fl3| are also widely used. The purpose of the present work is to demonstrate 
that these techniques can be used for simulations in multicanonical ensemble. Here, we 



consider three common algorithms: molecular dynamics, Langevin, and hybrid Monte 
Carlo. |15] The performance of the algorithms are tested with the system of an oligopep- 
tide, Met-enkephalin. 
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2. METHODS 

2.1. Multicanonical ensemble 

Simulations in the canonical ensemble at temperature T weigh each state with the Boltz- 
mann factor 

w B {E,T) = e-f , (1) 

where the inverse temperature is given by j3 = -j^rp with Boltzmann constant ks- This 
weight factor gives the usual bell-shaped canonical probability distribution of energy: 

P B (E,T) <xn(E) w B (E,T) , (2) 

where n(E) is the density of states. 

In the multicanonical ensemble,^ on the other hand, the probability distribution of 
energy is defined to be constant: 

P m u{E) oc n(E) w mu (E) = const. (3) 

The multicanonical weight factor for each state with energy E is then given by 

w mu (E)(xn- 1 (E) = e- s{E) , (4) 

where S(E) is the microcanonical entropy (with ks — 1): 

S(E)=lnn(E). (5) 

With the uniform probability distribution of Eq. (]3|), a simulation in multicanonical en- 
semble leads to a ID random walk in energy space, allowing itself to escape from any 
energy barrier and to explore wide range of the phase space. 

Unlike in a canonical simulation, however, the multicanonical weight w mu (E) is not 
a priori known, and one has to obtain its estimator for a numerical simulation. Hence, 
the multicanonical ansatz consists of three steps: In the first step the estimator of the 
multicanonical weight factor w mu (E) is calculated (for details of the method of finding 
w mu (E) for the case of Metropolis Monte Carlo algorithm, see Refs. |3|, |J). Then one 
makes with this weight factor a production run with high statistics. In this way infor- 
mation is collected over the whole energy range. Finally, by examining the history of 
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this simulation, one can not only locate the energy global minimum but also obtain the 
canonical distribution at any inverse temperature (3 for a wide range of temperatures by 
the re-weighting techniques: [HX 



P B (E, T) oc P mu (E) w~l(E) e-? E . (6) 
This allows one to calculate the expectation value of any physical quantity O by 



dE 0(E)P B (E,T) 

< O > T = J — r • (7) 

J dE P B (E,T) 

In the following subsections, we describe how to implement multicanonical simulations 
for Langevin, molecular dynamics, and hybrid Monte Carlo algorithms. 
2.2. Langevin algorithm in multicanonical ensemble 

The Langevin algorithm ||14|| is used to integrate the following differential equation: 



I J 



where (i — 1, • • • , N) are the (generalized) coodinates of the system, E(q) is the potential 
energy, and r\i is a set of independent Gaussian distributed random variables with a unit 
variance: 

< ViitfiVjitm) >= &ij&{tl - t m ). (9) 

It can be shown that the dynamics based on the Langevin algorithm yields a canonical 
distribution Pb{E, T) oc n(E)e~ /3E . For numerical work one integrates the above equation 
by discretizing the time with step At : 

ft (t + At) = qi (t) + At + ■ (10) 

A straightforward generalization of this technique to simulations in multicanonical 
ensemble can be made by replacing the f3E in Eq. (|8]) by the microcanonical entropy 
S(E): 

dqi 

The above equation now describes a dynamics which will yield a multicanonical distri- 
bution P mu (E) oc n{E)e~ s ^ = const, (see Eq. (f|)). (A similar consideration of multi- 
canonical Langevin algorithm is given in Ref. |17||.) Hence, for actual simulations we use 



the following difference equation: 



ft (t + At) = + At (- + • (12) 

We remark that Eq. ([0]) can be written as 

where ^(-B 1 ) is an energy-dependent effective inverse temperature. In this notation the 
term "multicanonical" becomes obvious (compare Eq. ( |i~3l) with Eq. (§)). 



<?.5. Molecular dynamics algorithm in multicanonical ensemble 
The expectation value of a physical quantity O is calculated by 

/ Dq 0(q)e^ E{q) 
<0> T = J — f : , (14) 

where the integration measure is defined by Dq = Ylf =1 dqi and ft (i = 1, • • • , N) are again 
the (generalized) coordinates of a system. E(q) is the potential energy of the system. The 
above equation is mathematically identical with 



(N 



<°>t = - ; Y =1 , rA ( 15 ) 



J DqDit exp (-g^ 



where we used the notation Dn = nf = iditi. Identifying the auxiliary variables 7Tj with 
the conjugate momenta corresponding to the coordinates qi, we can describe our system 
with a Hamiltonian 

1 N 

H{q,<K) = -Y,< J rPE{q u ---,q N ) , (16) 

Z 8=1 

where we have set all the masses equal to 1 for simplicity. 

The classical molecular dynamics algorithm uses the Hamilton's equations of motion 

dH 

Iff ~dE ( 17 ) 
dft dqi 

to generate representative ensembles of configurations. For numerical work the time is 
discretized with step At and the equations are integrated according to the leapfrog (or 



other time reversible integration) scheme: 

Ar 



qi (t + At) = qi (t) + At m (t + — \ , 

( 3 . \ / At\ pL dE 

7ii[t+ - At = Hi [t H At (3- 



2 J V 2 J ^dqi(t + At) ' 

The initial momenta {fti(^y)} for the iteration are prepared by 

/At\ r , At ~ dE 

* (t) = I - (0) " T a aS(o) • (19) 

with appropriately chosen ^(0) and 7r»(0) (^(0) is from a Gaussian distribution). 

In order to generalize this widely used technique to simulations in multicanonical 
ensemble, we again propose to replace (3E by the entropy S(E) in Eqs. (fl7f) , fll8D , and 
(p!9|) (just as we did for the Langevin algorithm). Hence, we have a new "Hamiltonian" 

1 N 

H(q^) = ^^ + S(E(q)) , (20) 
z i=i 

and a new set of Hamilton's equations of motion 

dH 

\h_ dS(E(q)) dS dE{q) ( 21 ) 
dqi dq { dE dq { 

This is the set of equations we adopt for multicanonical MD simulations. Formally it can 

be understood as a rescaling of the usual force term by the derivative of the entropy. For 

numerical simulations the Hamilton equations are again discretized in time and integrated 

by a leapfrog scheme. 

2.4- Hybrid Monte Carlo algorithm in multicanonical ensemble 

The hybrid Monte Carlo algorithm [|l5fl is based on the combination of molecular dynam- 
ics and Metropolis Monte Carlo algorithms. Namely, each proposal for the Metropolis 
method is prepared by a short MD run starting from the actual configuration. Hence, 
this algorithm is based on a global update, while in the conventional Metropolis method 
one is usually restricted to a local update. Furthermore, the Metropolis step ensures that 
the sampled configurations are distributed according to the chosen ensemble, while con- 
vential molecular dynamics simulations are hampered by difficult-to-control systematic 
errors due to finite step size in the integration of the equations of motion. 
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Given the set of coordinates {q{\ of the previous configuration and choosing the cor- 
responding momenta {7Tj} from a Gaussian distribution, a certain number of MD steps 
are performed to obtain a candidate configuration {g-,7r 4 '}. This candidate is accepted 
according to the Metropolis Monte Carlo criterion with probability 

p = min{l, e -(#(«V)-ir(ff.*))} ; (22) 

where H is the Hamiltonian in Eq. (|16|). The time reversibility of the leapfrog integration 
scheme ensures detailed balance and therefore convergence to the correct distribution. The 
whole process is repeated for a desired number of times (Monte Carlo steps). The number 
of integration (leapfrog) steps Nlf and the size of the time step At are free parameters 
in the hybrid Monte Carlo algorithm, which have to be tuned carefully. A choice of too 
small Nlf and At means that the sampled configurations are too much correlated, while 
too large Nlf (or At) yields high rejection rates. In both cases the algorithm becomes 
inefficient. 

The generalization of this technique to simulations in multicanonical ensemble can 
again be made by replacing the Hamiltonian of Eq. (|T6| ) with the multicanonical Hamil- 
tonian of Eq. (|20D , i.e., replacing j3E by the entropy S(E) in the equations of motion. 



3. RESULTS AND DISCUSSION 

The effectiveness of the algorithms presented in the previous section is tested for the 
system of an oligopeptide, Met-enkephalin. This peptide has the amino-acid sequence 
Tyr-Gly-Gly-Phe-Met. The potential energy function that we used is given by the sum 
of electrostatic term, Lennard- Jones term, and hydrogen-bond term for all pairs of atoms 
in the peptide together with the torsion term for all torsion angles. The parameters for 
the energy function were adopted from ECEPP/2.[|l8|]-f2(| The computer code SMC [21 



was modified to accomodate the multicanonical ensemble. 

For the coordinates {q{\ we used the dihedral angles. ( We remark that it was recently 
claimed that convergence is faster for the dihedral coordinates. p2] Of course we could have 
used Cartesian coordinates as well with the same set of equations.) The peptide-bond 
dihedral angles u were fixed to be 180° for simplicity. This leaves 19 dihedral angles 
as generalized coordinates. By definition of multicanonical ensemble, one cannot obtain 



information on the real dynamics of the system by the MD algorithm, and only static 
thermodynamic quantities can be calculated. For this reason we do not need to consider 



the equations of motion for dihedral space as presented in Ref . [p3| , but can use the much 
simpler form as given in the previous section. However, we remark that this may not be 
the optimal choice. Very often it may be more suitable to distinguish between "soft" and 
"hard" degrees of freedom and introduce appropriately chosen "masses" in the equations 



of motion. 22 



For the multicanonical MD simulations, we made a single production run with the total 
number of time steps Nlf = 400, 000 * 19 and the time-step size At = 0.005 (in arbitary 
units), after the optimal estimate for the multicanonical weight factor w mu (E), or entropy 
S(E), was obtained. For the multicanonical Langevin algorithm, a production run with 
the same number of time steps (Nlf — 400, 000 * 19) as in the MD simulation, but our 
optimal time-step size was only At = 0.0001. This indicates that the simulation moves 
more slowly through phase space, and we expect slower convergence to the multicanonical 
distribution than in MD case. For the multicanonical hybrid Monte Carlo algorithm, 
an MD simulation with 19 leapfrog steps was made for each Monte Carlo step and a 
production run with 200,000 MC steps was made. Since the Metropolis step in hybrid 
Monte Carlo corrects for errors due to the numerical integration of the equation of motion, 
the time-step size can be large for this algorithm. We chose At = 0.01 in our units. The 
initial conformation for all three simulations was the final (and therefore equilibrized) 
conformation obtained from a multicanonical Monte Carlo simulation of 200,000 sweeps, 
following 1,000 sweeps for thermalization with the same weights (in each sweep all of the 
19 angles were updated once). 

In Fig. 1 the time series of the total potential energy are shown for the three mul- 
ticanonical simulations. They all display a random walk in energy as they should for 
a simulation in multicanonical ensemble. All the lowest-energy conformations were es- 
sentially the same (with only a small amount of deviations for each dihedral angle) as 
that of the global-minimum energy conformation previously obtained for the same energy 
function (with uj = 180°) by other methods. p4|, |^, [25] The global-minimum potential 
energy value obtained by minimization is —10.7 kcal/mol.|25|] The random walks of the 
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MD and hybrid MC simulations visited the global-minimum region (E < —10 kcal/mol) 
three times and five times, respectively, while that of the Langevin simulation reached the 
region only once. These visits are separated by the walks towards the high energy region 
much above E — 16 kcal/mol, which corresponds to the average energy at T = 1000 K.|| 

In Fig. 2a the time series of the end-to-end distance r is plotted. Here, the distance was 
measured from N of Tyr 1 to O of Met 5. Only the result from the multicanonical hybrid 
Monte Carlo simulation is given, since the other two simulations give similar results. Note 
that there is a positive correlation between potential energy E and end-to-end distance 
r (compare Figs, lc and 2a), indicating that a folded structure generally has a lower 
potential energy than a stretched one. This becomes even clearer in Fig. 2b, where we 
display the average end-to-end distance r as a function of potential energy E. 

In Fig. 3 we demonstrate that the probability distribution P mu (E) of potential energy 
E obtained from the multicanonical MD simulation is essentially flat (of the same order 
of magnitude) over the whole energy range. Similar figures can be drawn for the other 
two algorithms. 

In Fig. 4 the entropy S(E) calculated from the probability distribution P mu (E) is dis- 
played (see Eqs. (Q) and (Q)). Only the result from multicanonical MD simulation is given, 
since the other two simulations give essentially the same results. It is a monotonically 
increasing function. Note that there is a sudden drop of S(E) near E = —10 kcal/mol, 
suggesting that the global-minimum conformation is "unique" . 

Simulations in multicanonical ensemble can not only find the energy global minimum 
but also any thermodynamic quantity as a function of temperature from a single simula- 
tion run. We have calculated the specific heat and average potential energy as functions 
of temperature for the three algorithms. The results all agreed within errors with those 
from our previous multicanonical MC runs (see, for instance, Refs. [f|, ||). Here, we just 
show another example of such a calculation, the average end-to-end distance as a function 
of temperature. The results are essentially the same for the three algorithms. That from 
multicanonical Langevin algorithm is shown in Fig. 5. We see that the average end-to-end 
distance becomes smaller as the temperature is lowered, indicating that the peptide has 
a compact structure at low temperatures. 



9 



CONCLUSIONS 

In this article we have shown that the multicanonical ansatz is not restricted to Monte 
Carlo simulations, but can also be used in combination with other simulation methods 
such as molecular dynamics, Langevin, and hybrid Monte Carlo algorithms. We have 
tested the performances of these three methods in multicanonical ensemble for a simple 
peptide, Met-enkephalin. The results were comparable to those of the original Monte 
Carlo version. || We believe that there is a wide range of applications for multicanonical 
versions of molecular dynamics and related algorithms. For instance, multicanonical MD 
simulations may prove to be a valuable tool for refinement of the protein structures inferred 
from X-ray and/or NMR experiments. 

Acknowledgements: 

The authors thank A. Kidera and N. Nakajima for informing us that they have also de- 
veloped a method for implementing an MD algorithm in multicanonical ensemble. We are 
grateful to F. Hirata for letting us know the existence of Refs. [ 171 , Our simulations 



were performed on the computers of the Computer Center at the Institute for Molecu- 
lar Science, Okazaki, Japan. This work is supported, in part, by the Grants- in- Aid for 
Scientific Research from the Japanese Ministry of Education, Science, Sports, and Culture. 



References 

[1] S. Kirkpatrick, CD. Gelatt, Jr., and M.P. Vecchi, Science 220 (1983) 671. 

[2] B.A. Berg and T. Neuhaus, Phys. Lett. B267 (1991) 249; Phys. Rev. Lett. 68 
(1992) 9. 

[3] B.A. Berg, Int. J. Mod. Phys. C3 (1992) 1083. 

[4] B.A. Berg and T. Celik, Phys. Rev. Lett. 69 (1992) 2292; B.A. Berg, U.H.E. Hans- 
mann, and T. Celik, Phys. Rev. B50 (1994) 16444. 

[5] U.H.E. Hansmann and Y. Okamoto, J. Comp. Chem. 14 (1993) 1333. 

10 



[6] J. Lee, Phys. Rev. Lett. 71 (1993) 211. 

[7] B.A. Berg, U.H.E. Hansmann, and Y. Okamoto, J. Phys. Chem. 99 (1995) 2236. 

[8] U.H.E. Hansmann and Y. Okamoto, J. Phys. Soc. Jpn. 63 (1994) 3945; Physica A 
212 (1994) 415. 

[9] M.H. Hao and H.A. Scheraga, J. Phys. Chem. 98 (1994) 4940. 

[10] Y. Okamoto, U.H.E. Hansmann, and T. Nakazawa, Chem. Lett. 1995 391; 
Y. Okamoto and U.H.E. Hansmann, J. Phys. Chem. 99 (1995) 2236. 

[11] A. Kidera, Proc. Nat. Acad. Set. U.S.A. 92 (1995) 9886. 

[12] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, J. 
Chem. Phys. 21 (1953) 1087. 

[13] For instance, L. Verlet, Phys. Rev. 159 (1967) 98. 

[14] G. Parisi and Y.-S. Wu, Sci. Sin. 24 (1981) 483. 

[15] S. Duane, A.D. Kennedy, B.J. Pendleton, and D. Roweth, Phys. Lett. B195 (1987) 
216. 

[16] A.M. Ferrenberg and R.H. Swendsen, Phys. Rev. Lett. 61 (1988) 2635; ibid. 63 
(1989) 1658(E), and references given in the erratum. 

[17] T. Munakata and S. Oyama, 'Adaptation and linear response theory", Kyoto Uni- 
versity preprint. 

[18] F.A. Momany, R.F. McGuire, A.W. Burgess, and H.A. Scheraga, J. Phys. Chem. 
79 (1975) 2361. 

[19] G. Nemethy, M.S. Pottle, and H.A. Scheraga, J. Phys. Chem. 87 (1983) 1883. 
[20] M.J. Sippl, G. Nemethy, and H.A. Scheraga, J. Phys. Chem. 88 (1984) 6231. 
[21] The program SMC was written by F. Eisenmenger. 



11 



[22] B.M. Forrest and U.W. Suter, J. Chem. Phys. 101 (1994) 2616. 

[23] A.K. Mazur, V.E. Dorofeev, and R.A. Abagyan, J. Comp. Phys. 92 (1991) 261. 

[24] Y. Okamoto, T. Kikuchi, and H. Kawai, Chem. Lett. 1992 1275. 

[25] H. Meirovitch, E. Meirovitch, A.G. Michel, and M. Vasquez, J. Phys. Chem. 98 
(1994) 6241. 



12 



FIGURE CAPTIONS: 

FIG. 1. (a) Time series of the total potential energy E from a multicanonical Langevin 
simulation of 400,000*19 time steps with step size At = 0.0001. (b) Time series of E from 
a multicanonical molecular dynamics simulation of 400,000*19 time steps with step size 
At = 0.005. (c) Time series of E from a multicanonical hybrid Monte Carlo simulation 
of 200,000 MC steps. For each MC step an MD run of 19 time steps was made with step 
size At = 0.01. 

FIG. 2. (a) Time series of end-to-end distance r from the multicanonical hybrid Monte 
Carlo simulation, (b) The average end-to-end distance r as a function of potential energy 
E obtained from the multicanonical hybrid Monte Carlo simulation. 
Fig. 3. Probability distribution of potential energy E obtained from the multicanonical 
molecular dynamics simulation. 

Fig. 4: Microcanonical entropy S(E) as a function of potential energy E obtained from 
the multicanonical molecular dynamics simulation. 

FIG. 5: The average end-to-end distance r as a function of temperature obtained from 
the multicanonical Langevin simulation. 
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